Concept | Description |
---|---|
NoSQL | Refers to non-relational databases that can handle large volumes of rapidly changing data and scale out horizontally. They often relax strict ACID properties for performance and flexibility. |
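CAP Theorem | States that a distributed data store cannot provide Consistency, Availability, and Partition Tolerance all at once. Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts. Different NoSQL databases choose different trade-offs. |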
BASE | “Basically Available, Soft state, Eventually consistent.” Many NoSQL databases follow BASE principles instead of strict ACID transactions to optimize for performance and scalability. |
Horizontal Scalability | NoSQL databases typically scale by adding more nodes (sharding/partitioning data), rather than vertically scaling a single server. |
Schema Flexibility | Most NoSQL databases do not enforce a rigid schema. The data model can evolve more easily as requirements change. |
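
The horizontal scalability row can be illustrated with a minimal sketch of hash-based sharding. The node names and the `shard_for` helper are hypothetical. Production systems generally prefer consistent hashing or token ranges over a plain modulo, so that adding a node does not remap most keys.

```python
import hashlib

# Hypothetical node names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def shard_for(key: str, nodes: list[str]) -> str:
    """Map a key to a node using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used to get repeatable placement.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[bucket]


if __name__ == "__main__":
    for key in ("user:1001", "user:1002", "session:abc123"):
        print(key, "->", shard_for(key, NODES))
```

The same key always lands on the same node, which is what lets any client route a read or write without a central lookup.
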
Concept | Description | Schema Example |
---|---|---|
Definition | Stores data as simple key-value pairs. Good for caching and real-time applications with simple lookups. Examples: Redis, Memcached. | Key: "user:1001" Value: "John Doe" |
Typical Use Cases | Session management, caching, leaderboard counts, token storage, quick retrieval by key. | Example: Key: "session:abc123" Value: "{'user_id': 1001, 'expires': '2025-03-09T15:00:00'}" |
Common Commands | Set, Get, Delete by key. | Redis Example: `SET user:1001 "John Doe"` then `GET user:1001` |
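
To make the set/get/delete model and the session use case concrete, here is a minimal in-memory sketch of a key-value store with per-key expiry. `TinyKV` and its methods are hypothetical names, not the API of Redis or Memcached, and the injectable `clock` parameter is an assumption added so that expiry can be tested without sleeping.

```python
import time
from typing import Callable, Optional


class TinyKV:
    """In-memory key-value store with optional per-key time-to-live."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expiry time or None for "never expires")
        self._data: dict[str, tuple[str, Optional[float]]] = {}

    def set(self, key: str, value: str, ttl: Optional[float] = None) -> None:
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it was present."""
        return self._data.pop(key, None) is not None


if __name__ == "__main__":
    kv = TinyKV()
    kv.set("user:1001", "John Doe")
    kv.set("session:abc123", "1001", ttl=30)  # session expires after 30 s
    print(kv.get("user:1001"))
```

Every operation addresses exactly one key, which is why this model is fast and easy to shard but cannot answer queries on the value's contents.
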
Concept | Description | Schema Example |
---|---|---|
Definition | Stores data as documents (usually JSON or BSON). Offers flexible schema and advanced querying. Examples: MongoDB, CouchDB, Firestore. | MongoDB Document Example: `{ "_id": 1001, "name": "John Doe", "email": "john@example.com", "orders": [ { "order_id": 500, "total": 89.99 } ] }` |
Typical Use Cases | Content management systems, user profiles, event logging, any scenario requiring flexible data structures. | Collection: "users". Documents represent individual user profiles; each can have different fields if needed. |
Common Commands | Insert, Find, Update, Delete (typical CRUD operations). Also supports indexes on fields. | MongoDB Example: `db.users.insertOne({ _id: 1001, name: "John Doe" })` then `db.users.find({ _id: 1001 })` |
Query Flexibility | Can query nested fields and arrays, and perform aggregations. Supports advanced operators like `$in`, `$lt`, `$regex`, etc. | Aggregation Example: `db.orders.aggregate([ { $match: { status: "shipped" } }, { $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } } ])` |
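
The aggregation example in the table filters shipped orders and sums `amount` per customer. A plain-Python sketch of the same two stages can help when reasoning about what the pipeline returns. The sample documents and the `total_spent_by_customer` function are made up for illustration, and this is not how MongoDB executes the pipeline internally.

```python
from collections import defaultdict

# Made-up sample documents shaped like the "orders" collection.
orders = [
    {"customerId": 1001, "status": "shipped", "amount": 50.0},
    {"customerId": 1001, "status": "shipped", "amount": 25.5},
    {"customerId": 1002, "status": "pending", "amount": 10.0},
    {"customerId": 1002, "status": "shipped", "amount": 40.0},
]


def total_spent_by_customer(docs):
    """Mimic a $match on status followed by a $group with $sum."""
    totals = defaultdict(float)
    for doc in docs:
        if doc.get("status") != "shipped":  # $match stage
            continue
        totals[doc["customerId"]] += doc["amount"]  # $group with $sum
    return [{"_id": cid, "totalSpent": total} for cid, total in totals.items()]


if __name__ == "__main__":
    for row in total_spent_by_customer(orders):
        print(row)
```

The pending order for customer 1002 is dropped by the match step, so only shipped amounts contribute to each total. A real `$group` stage does not guarantee output order, so callers should sort if order matters.
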
Concept | Description | Schema Example |
---|---|---|
Definition | Organizes data into column families, whose rows can have varying columns. Optimized for reading and writing large volumes of data. Examples: Cassandra, HBase. | Cassandra Table Example: Keyspace: my_app Table: users Primary key: (user_id) `CREATE TABLE users ( user_id int, name text, email text, PRIMARY KEY (user_id) );` |
Typical Use Cases | High write throughput, large-scale analytics, time-series data, event logging with predictable query patterns. | Example: For sensor readings, partition by sensor ID and use the timestamp as a clustering column, so each partition holds one sensor's readings in time order. |
Common Commands | CQL (Cassandra Query Language) is similar to SQL. You define tables, insert/update data, use partition keys, clustering keys, etc. | Cassandra Example: `INSERT INTO users (user_id, name, email) VALUES (1001, 'John Doe', 'john@example.com');` |
Partitioning | Data is distributed across the cluster using partition keys for horizontal scalability. Careful design of partition keys is crucial for performance. | Partition Key Example: `PRIMARY KEY ((user_id), some_other_key)`, where user_id is the partition key and some_other_key is a clustering column. |
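
The roles of the partition key and the clustering key can be sketched with an in-memory model: the partition key selects a bucket, and rows inside the bucket stay sorted by the clustering key, which makes range scans cheap. `WideTable` is a hypothetical class for illustration, not Cassandra's storage engine, and it does not model upserts, replication, or on-disk layout.

```python
import bisect
from collections import defaultdict


class WideTable:
    """Rows grouped by partition key and kept sorted by clustering key."""

    def __init__(self) -> None:
        # partition key -> sorted list of (clustering key, value)
        self._partitions: dict[str, list[tuple[int, float]]] = defaultdict(list)

    def insert(self, partition_key: str, clustering_key: int, value: float) -> None:
        # insort keeps each partition ordered by clustering key.
        bisect.insort(self._partitions[partition_key], (clustering_key, value))

    def range_query(
        self, partition_key: str, start: int, end: int
    ) -> list[tuple[int, float]]:
        """Return rows with start <= clustering_key < end from one partition."""
        rows = self._partitions.get(partition_key, [])
        # A 1-tuple sorts before any 2-tuple with the same first element.
        lo = bisect.bisect_left(rows, (start,))
        hi = bisect.bisect_left(rows, (end,))
        return rows[lo:hi]


if __name__ == "__main__":
    table = WideTable()
    # Sensor ID as partition key, timestamp (seconds) as clustering key.
    table.insert("sensor-1", 30, 21.5)
    table.insert("sensor-1", 10, 20.0)
    table.insert("sensor-1", 20, 20.7)
    print(table.range_query("sensor-1", 10, 30))
```

A query that names the partition key touches one bucket only. A query without it would have to visit every partition, which is why these stores reward designing tables around known query patterns.
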
Concept | Description | Schema Example |
---|---|---|
Definition | Designed to store data as nodes and relationships (edges). Excellent for highly interconnected data. Examples: Neo4j, JanusGraph. | Neo4j Schema Example: Nodes: `(:Person { name: "John", age: 30 })` Relationship: `(John)-[:KNOWS]->(Jane)` |
Typical Use Cases | Social networks, recommendation engines, fraud detection, network topologies, or anything requiring graph traversal. | Example: A "Friend of a Friend" search or shortest path between entities. |
Common Commands / Query Language | Cypher (Neo4j), Gremlin (Apache TinkerPop). Queries use pattern matching on node labels and relationships. | Neo4j Cypher Query Example: `MATCH (p:Person)-[:KNOWS]->(friend:Person) WHERE p.name = "John" RETURN friend;` |
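
The "friend of a friend" and shortest-path use cases mentioned in the table come down to graph traversal. The sketch below shows both over a plain adjacency dictionary. The `KNOWS` data and both function names are hypothetical, and a graph database would run such traversals over its own storage and indexes instead of a Python dict.

```python
from collections import deque

# Hypothetical directed KNOWS relationships: person -> people they know.
KNOWS = {
    "John": ["Jane", "Raj"],
    "Jane": ["Maria"],
    "Raj": ["Maria", "John"],
    "Maria": [],
}


def friends_of_friends(graph, person):
    """People exactly two KNOWS hops away who are not already direct friends."""
    direct = set(graph.get(person, []))
    result = set()
    for friend in direct:
        for candidate in graph.get(friend, []):
            if candidate != person and candidate not in direct:
                result.add(candidate)
    return result


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list of a shortest path or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    print(friends_of_friends(KNOWS, "John"))
    print(shortest_path(KNOWS, "John", "Maria"))
```

In a relational model each hop would typically be another self-join, whereas here each hop is a direct neighbor lookup.
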
Practice | Description |
---|---|
Know Your Access Patterns | Design your schema (or data model) around how data is queried. This is critical in NoSQL to optimize performance. |
Use Indexes Wisely | Indexes speed up reads but can slow writes and use additional memory/disk. Only index what you really need. |
Partition / Shard Carefully | Even data distribution across the nodes of a cluster is important for performance. Avoid hotspots by choosing keys that won’t concentrate load on a single node. |
Monitor and Tune | Monitor query performance, resource usage, and replication lag. Tweak configurations (e.g., read/write consistency levels) for your workload. |
Security and Backups | Enable authentication/authorization, encrypt data at rest and in transit, and have a reliable backup/recovery strategy. |
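
The index trade-off in the table (faster reads, slower writes, more memory) can be seen in a small sketch. `IndexedCollection` is a hypothetical in-memory class, not any database's API. It assumes indexed field values are hashable and document IDs are sortable.

```python
from collections import defaultdict


class IndexedCollection:
    """Documents keyed by ID, with optional secondary indexes on chosen fields."""

    def __init__(self, indexed_fields):
        self._docs = {}
        # One extra structure per indexed field: value -> set of document IDs.
        self._indexes = {field: defaultdict(set) for field in indexed_fields}

    def insert(self, doc_id, doc):
        old = self._docs.get(doc_id)
        if old is not None:
            # Overwrite: drop stale index entries for the previous version.
            for field, index in self._indexes.items():
                if field in old:
                    index[old[field]].discard(doc_id)
        self._docs[doc_id] = doc
        # Extra write work: one index update per indexed field.
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field, value):
        """Return matching document IDs in sorted order."""
        if field in self._indexes:
            ids = self._indexes[field].get(value, set())  # direct index lookup
        else:
            # No index on this field: scan every document.
            ids = [i for i, d in self._docs.items() if d.get(field) == value]
        return sorted(ids)


if __name__ == "__main__":
    users = IndexedCollection(indexed_fields=["city"])
    users.insert(1, {"name": "John Doe", "city": "Oslo"})
    users.insert(2, {"name": "Jane Roe", "city": "Lima"})
    print(users.find("city", "Oslo"))      # served by the index
    print(users.find("name", "Jane Roe"))  # falls back to a full scan
```

Each additional indexed field adds work to every insert and another structure to keep in memory, which is the cost behind the advice to index only what the queries need.
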